ABSTRACT

The normal curve has long been important in statistics. Most interval
variables yield normal or quasi-normal distributions when data are
collected from large samples, and the normal "Z" distribution is also
used as a test statistic (e.g., to test differences between two means
when sample size is large, since "t" approaches "Z" as degrees of
freedom increase). Thus, almost all statistics books discuss the normal
curve. Nevertheless, many researchers do not fully understand some
concepts related to the normal curve, such as skewness and kurtosis
statistics, because these two statistics often receive cursory
instructional treatment, given the press for instructional time. This
paper illustrates that shape statistics remove the influence of
distribution variability (i.e., shape statistics always initially
involve the conversion of raw scores to "Z" form, SD=1=V, so that the
impact of variability is held constant). Nine figures illustrate the
shape statistics, and one table lists raw scores and "Z" scores. An
eight-item list of references is included. (Author/SLD)



The Normal Curve Takes Many Forms: A Review of Skewness and Kurtosis 

The normal curve has many useful mathematical properties. For example, the
percentage of people scoring within a given number of SD units from the mean is
always known. Thus, in a normal distribution about 68% of scores fall within one
SD of the mean, about 95% fall within two SDs, and more than 99% fall within
three SDs (Gronlund, 1971, p. 387). These facts are useful to researchers because
intervally scaled data often are normally or nearly normally distributed. Put
differently, when these and related rules work, scores can be considered
distributed normally. Thus, it is useful to know when data constitute a normal
distribution.
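
The percentages quoted above can be checked numerically. The following is a minimal sketch, not part of the original paper, that relies on the standard identity P(|Z| <= k) = erf(k / sqrt(2)) for a normal variable; the function name `share_within` is chosen here purely for illustration.

```python
import math

def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k SDs of the mean."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68%, 95%, and 99.7% for k = 1, 2, 3.
    print(f"within {k} SD of the mean: {100 * share_within(k):.1f}%")
```

The rounded values match the 68%, 95%, and 99% rules of thumb cited in the text.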

What is a normal curve and what does it look like?
The normal curve was investigated in the eighteenth century by
mathematicians who were asked by gamblers, interested in winning gambling
games, what the probabilities of certain outcomes were. Their chances of winning
were represented by a curve (Downie & Heath, 1965). This work was elaborated by
others and is widely used today. Downie and Heath point out the following
assumptions made about the normal curve:

In our educational and psychological work, we assume that certain traits are
normally distributed. In actuality, probably no distribution ever takes on the 
absolute form of the normal distribution. Many of our frequency 
distributions are very close to the normal one, and we assume that they have 
a normal distribution. To the extent that our distributions differ from 
normal, error enters into our work. The normal curve is important not 
primarily because scores are assumed to be normally distributed, but because 
the sampling distributions of various statistics are known or assumed to be 
normal. (p. 69)

The first users of the normal curve believed that almost all human
characteristics were distributed in a random fashion around an average value. 
These human characteristics included intellectual and moral qualities as well, and 
this thought, that somehow abilities are naturally distributed in a normal way, has 
carried over to mental measurement (Nitko, 1983). This is one reason why the 
normal curve is studied, analyzed, and thought to be important in statistics. 

The normal curve has been described as a mathematical model defined by a
particular equation that depends on two specific numbers: the mean and the
standard deviation, signifying that many normal distributions exist and each has a
different mean or standard deviation (Nitko, 1983). These two statistics are then
used to calculate two additional statistics that are used to evaluate normality:
skewness and kurtosis. These four elements, the mean, the standard deviation, the
skewness, and the kurtosis, are called the first four moments of a
distribution.

Why are these statistics referred to as "moments"? 
A moment is a mechanical term for the measure of a force with reference to its
tendency to produce rotation. This tendency to produce rotation is related to the
amount of the force applied and the distance from the origin at which this force is
exerted (Mills, 1955). In Figure 1 we see eight pounds and two pounds representing
the forces applied in a given situation. The eight pounds of pressure being exerted
on the point one foot above the origin at zero is balanced by a force of two pounds
being exerted four feet below the origin. The sum of the moments tending to cause
rotation in one direction is equal to the sum of the moments tending to cause
rotation in the opposite direction, so the object is balanced (Odell, 1957). If either of
these points were moved or if the point of origin were moved, the sum of the forces
that are measured by the moments would not be zero and the object would not be in
balance.

Figure 1. The relationship of weight and distance on the balance of the line.
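
The balance described for Figure 1 can be sketched in a few lines. This is an illustrative sketch only: the sign convention (positive distances above the origin, negative below) and the function name `net_moment` are assumptions made here, not part of the original paper.

```python
# Each force is (weight in pounds, signed distance in feet from zero).
# Positive distances lie above the origin, negative distances below it.
forces = [(8, 1), (2, -4)]

def net_moment(forces, origin=0):
    """Sum of weight times signed distance from the chosen origin."""
    return sum(weight * (distance - origin) for weight, distance in forces)

print(net_moment(forces))            # 8*1 + 2*(-4) = 0, so the line balances
print(net_moment(forces, origin=1))  # moving the origin breaks the balance
```

With the origin at zero the two moments cancel; with the origin shifted by one foot the net moment is no longer zero, which is the imbalance the text describes.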



In statistics, the term "moment" is used by analogy: class frequencies play the
role of the forces exerted in the previous example. In Figure 2 we see a histogram
for a test in which the mean is 104 and in which there are 90 grades. If each of the
columns is thought of as a solid rectangle, with each column exerting a pressure on
the X-axis, we can see the contribution of the forces. The "moment" contribution of
each column is measured by the product of the class frequency (f) and the
corresponding deviation (x) from the origin. The sum of the fx products, divided by
the total frequency, gives a net measure called the first moment or the mean
(Mills, 1955).

Figure 2. Class frequencies, pressures on the x-axis, and the balance around the mean.
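
The first moment can be computed directly from class frequencies. The sketch below uses hypothetical class midpoints and frequencies (the data behind Figure 2 are not reproduced in the text, so these numbers are invented for illustration and give a mean of 105 rather than 104). It also checks that the frequency-weighted deviations from the mean sum to zero, which is the "balance" idea carried over from Figure 1.

```python
# Hypothetical class midpoints and frequencies for 90 grades (illustration only).
midpoints = [80, 90, 100, 110, 120, 130]
frequencies = [5, 15, 25, 25, 15, 5]

n = sum(frequencies)

# First moment about the origin: sum of f*x divided by the total frequency.
first_moment = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Taken about the mean instead, the weighted deviations cancel out.
balance = sum(f * (x - first_moment) for f, x in zip(frequencies, midpoints))

print(n, first_moment, balance)
```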



The second moment of a statistical series, taken about the mean, is the
variance; its square root is the standard deviation, which measures the variation
of the scores around the mean. Slight differences in patterns of variation are
reflected in the moments, which define the degree and character of the variation.

The third and fourth moments, skewness and kurtosis, are directly related to
the standard deviation. Skewness and kurtosis quantitatively indicate the
nonnormal variation in the statistical series. Skewness refers to the asymmetry of
the curve and kurtosis refers to the tallness or flatness of the curve. Both of these
moments depend upon the manner in which the scores scatter about the mean. A
symmetrical curve provides a mirror image from a line drawn through the mean.
But if the scatter is greater on one side of the mean than on the other side, the
distribution is said to be skewed (Tate, 1965). When the distribution of scores
extends from the mean further toward the larger values than smaller values of the
distribution, the distribution is said to be positively skewed or skewed right. When
the distribution of scores extends from the mean further toward the smaller values
than larger values of the distribution, the distribution is said to be negatively
skewed or skewed left.

The skewness is formulated from the third moment of the distribution because
it reflects the average of the deviation scores raised to the third power divided by the
standard deviation raised to the third power (Newell & Hancock, 1984). The
formula for this is:

$$\text{Skewness} = \frac{\sum (X - \bar{X})^3 / n}{SD^3}$$

When all the scores have been converted to z-scores (mean = 0; SD = 1), we can use a
much simpler formula and will always get the same answer for a given data set:

$$\text{Skewness} = \frac{\sum z^3}{n}$$
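
The equivalence of the raw-score and z-score formulas can be verified on a small data set. This is a minimal sketch under stated assumptions: the scores are hypothetical, the function names are chosen here for illustration, and the SD is computed with n (not n - 1) in the denominator.

```python
import math

def mean_and_sd(scores):
    """Mean and SD of the scores, dividing by n."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return mean, sd

def skewness_raw(scores):
    """Average cubed deviation divided by the cubed SD."""
    mean, sd = mean_and_sd(scores)
    return (sum((x - mean) ** 3 for x in scores) / len(scores)) / sd ** 3

def skewness_z(scores):
    """Average of the cubed z-scores."""
    mean, sd = mean_and_sd(scores)
    z = [(x - mean) / sd for x in scores]
    return sum(v ** 3 for v in z) / len(z)

# Hypothetical scores with a long tail toward the larger values.
scores = [2, 3, 3, 4, 4, 4, 5, 9]
print(skewness_raw(scores), skewness_z(scores))
```

Both functions return the same positive value for these scores, consistent with a positively skewed (skewed right) distribution.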

When there is a higher concentration of scores around the mean, the
distribution is relatively narrow and the curve has positive kurtosis. When there is
a low concentration of scores around the mean, the distribution is relatively broad
and the curve has negative kurtosis. Kurtosis is called the fourth moment of the
distribution because it is the ratio of the average of the deviation scores raised to the
fourth power to the standard deviation also raised to the fourth power. Using this
ratio, the normal curve has a kurtosis value of 3, although the common practice
of researchers and statistics packages now is to subtract 3 from the kurtosis value
obtained so that zero represents the kurtosis value for a normal curve (Newell &
Hancock, 1984), just as skewness of 0 implies no skewness relative to the normal
distribution. A "tall" or "peaked" curve has a kurtosis value greater than 0, and a
"flat" curve has a kurtosis value less than 0. The formula for kurtosis is:

$$\text{Kurtosis} = \frac{\sum (X - \bar{X})^4 / n}{SD^4} - 3$$

Alternatively, if the scores are converted to z-scores, we can use a much
simpler formula that always yields the same answer for a given data set:

$$\text{Kurtosis} = \frac{\sum z^4}{n} - 3$$
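
To see how the sign of kurtosis tracks "flat" versus "peaked" shapes, the z-score formula can be applied to two small data sets. This is an illustrative sketch only: both data sets are hypothetical, the function name `excess_kurtosis` is chosen here, and the SD is again computed with n in the denominator.

```python
import math

def excess_kurtosis(scores):
    """Average of the fourth powers of the z-scores, minus 3."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    z = [(x - mean) / sd for x in scores]
    return sum(v ** 4 for v in z) / n - 3

# Hypothetical "flat" scores: every value occurs equally often.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Hypothetical "peaked" scores: most values pile up at the mean.
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 5, 9]

print(excess_kurtosis(flat))    # negative
print(excess_kurtosis(peaked))  # positive
```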

How does all this help us look at a curve and estimate if it is a normal curve?
Asking if a certain curve looks like a normal curve is like asking, "Does that
person look tall?", without knowing how fat that person is. So the real question to
ask is, "Is the curve tall in comparison to its spreadoutness?" We cannot think
about how tall someone looks without knowing how fat or skinny they are. So to
compare people we would need to make them all the same width, and then we
could easily see their variations and which ones vary from the norm. In
order to compare test scores and their distributions, we need to make them all the
same "width" by standardizing them. This can be done by converting the raw scores
to z-scores.

Z-scores are the most basic standard scores and are used to derive other kinds of
standard scores. Z-scores express raw score performance in terms of the number of
SD units above or below the mean. "Knowing the z-score of any score enables us to
determine the percentile rating of the score by comparing it to the properties of the
standard normal distribution" (Moore, 1983, p. 221). Table 1 presents the z-scores for
the data (Thompson, 1991) that will be employed to illustrate these dynamics.
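
The conversion to z-scores, and the percentile lookup that Moore describes, can be sketched as follows. The scores are hypothetical (they are not the Table 1 data), the function names are chosen here for illustration, and the percentile calculation assumes the scores really are normally distributed.

```python
import math

def z_scores(scores):
    """Convert raw scores to z-scores, using the SD computed with n."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return [(x - mean) / sd for x in scores]

def normal_percentile(z):
    """Percentile of z under the standard normal curve (assumes normality)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

scores = [40, 45, 50, 55, 60]  # hypothetical raw scores
for raw, z in zip(scores, z_scores(scores)):
    print(f"raw {raw}: z = {z:+.3f}, percentile = {normal_percentile(z):.1f}")
```

By construction the z-scores have a mean of 0 and an SD of 1, so a score at the mean sits at the 50th percentile.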

INSERT TABLE 1 ABOUT HERE



By using any spreadsheet application, the raw scores can be quickly converted to
z-scores. It is very helpful if the spreadsheet application that is used has graphing
capabilities. The columns with range and frequency are used to obtain the
histogram representing the curve produced by the scores.

Table 1 presents the sum of the 100 z-scores; by definition it is zero because the
mean of z-scores is 0, and this only occurs if the sum of the scores is also 0. The next
column shows z^2, and when these are summed and divided by the number of scores,
the result is .9998, i.e., a variance (and standard deviation) of essentially 1 for the
z-scores. The next column produces the value of skewness, i.e., z^3 summed (0.0000)
and divided by the number of scores is 0.0000. The last column produces the value of
kurtosis, before the subtraction of 3, when z^4 is summed (285.3902) and divided by
the number of scores, i.e., 2.8539.

Once these values have been obtained we can also graph the frequencies of the
scores and then manipulate certain values to explore what happens to our curve. In
the first set of figures, presented in Figures 3a, 3b, and 3c, the standard deviation has
been changed, but the mean is held constant at 50. Though the distributions appear
to be different in shape, all three of the figures still represent normal curves.

Figure 3a. The normal curve with a mean of 50 and SD of 15 (chosen by author).

Figure 3b. The normal curve with a mean of 50 and SD of 10 (chosen by author).

Figure 3c. The normal curve with a mean of 50 and SD of 5 (chosen by author).

We can also take our scores and see what happens when we change both the
mean and the SD of the 100 scores from Table 1. Figure 4a presents the same scores
(mean = 50, SD = 10.05) presented in Figure 3b, for comparison purposes. In Figure 4b
the scores have been multiplied by 1.3. Notice that this spreads the scores out and
increases the standard deviation, but the kurtosis and the skewness values remain
the same because these values (skewness and kurtosis) are ratios to SD^3 or SD^4 and
the criteria for a normal curve have still been met.

Figure 4a. The curve with the z scores multiplied by 1.00. Mean: 50;
SD (sum of z^2/n) = 10.05; skewness (sum of z^3/n) = 0; kurtosis (sum of z^4/n - 3) = -.20.

Figure 4b. The curve with the z scores multiplied by 1.30. Mean: 65;
SD (sum of z^2/n) = 13.06; skewness (sum of z^3/n) = 0; kurtosis (sum of z^4/n - 3) = -.20.

In Figures 4c and 4d a multiplicative constant less than 1 has been applied. The
standard deviation goes down and the spread is narrower in both cases, but the
kurtosis and the skewness values remain the same because, as emphasized before,
the skewness and kurtosis values are computed by first converting scores to z form.
Changes in the SD of the raw scores have no effect on the SD of z, which is always 1.0.

Figure 4c. The curve with the z scores multiplied by 0.50. Mean: 25;
SD (sum of z^2/n) = 5.02; skewness (sum of z^3/n) = 0; kurtosis (sum of z^4/n - 3) = -.20.

Figure 4d. The curve with the z scores multiplied by 0.20. Mean: 10;
SD (sum of z^2/n) = 2.01; skewness (sum of z^3/n) = 0; kurtosis (sum of z^4/n - 3) = -.20.
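
The pattern across Figures 4a through 4d can be reproduced in a short sketch. The ten scores below are hypothetical (they are not the 100 scores of Table 1, so the skewness and kurtosis values will differ from those in the captions), and the function name `shape_stats` is chosen here for illustration. The point is only that multiplying every score by a constant changes the mean and the SD while leaving the two shape statistics untouched.

```python
import math

def shape_stats(scores):
    """Return (mean, SD, skewness, kurtosis minus 3), dividing by n throughout."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    z = [(x - mean) / sd for x in scores]
    skew = sum(v ** 3 for v in z) / n
    kurt = sum(v ** 4 for v in z) / n - 3
    return mean, sd, skew, kurt

# Hypothetical scores with a mean of 50 (illustration only).
scores = [30, 38, 44, 47, 50, 50, 53, 56, 62, 70]

# The same multipliers used in Figures 4a, 4b, 4c, and 4d.
for factor in (1.0, 1.3, 0.5, 0.2):
    mean, sd, skew, kurt = shape_stats([factor * x for x in scores])
    print(f"x{factor}: mean={mean:.2f} SD={sd:.2f} "
          f"skewness={skew:.4f} kurtosis={kurt:.4f}")
```

Each line of output shows a different mean and SD but the same skewness and kurtosis, because the multiplier cancels when the scores are converted to z form.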

By converting the raw scores to z scores and applying shape statistics to them,
we were able to see that some distributions that look tall were still normal and that
some distributions that looked flat were also normal. So as a person "eyeballs" a
distribution of scores to determine if it is normal, the spreadoutness of the scores in
relation to the height must be considered. That is, "eyeballing" the distribution can
lead to incorrect conclusions.